The Renyi index (RI) is a one-parameter class of indices that
summarize health disparities among population groups by measuring
divergence between the distributions of disease burden and population
shares of these groups. The rank-dependent RI introduced in this
paper is a two-parameter class of health disparity indices that also
accounts for the association between socioeconomic rank and health;
it may be derived from a rank-dependent social welfare function. Two
competing classes are discussed and the rank-dependent RI is shown
to be more robust to changes in the distribution of either
socioeconomic rank or health. The standard error and sampling
distribution of the rank-dependent RI are evaluated using linearization
and resampling techniques, and the methodology is illustrated using
health survey data from the U.S. National Health and Nutrition
Examination Survey and registry data from the U.S. Surveillance,
Epidemiology and End Results Program. Such data underlie many
population-based objectives within the U.S. Healthy People 2020
initiative. The rank-dependent RI provides a unified mathematical
framework for eliciting various societal positions with regard to the
policies that are tied to such wide-reaching public health initiatives.
For example, if population groups with lower socioeconomic position
were ascertained to be more likely to utilize costly public programs,
then the parameters of the RI could be selected to reflect prioritizing
those population groups for intervention or treatment.

1. Introduction. The socioeconomic gradient in health outcomes and
resulting health disparities are now well documented in the United States
(U.S.) and elsewhere [Costa-Font and Hernandez-Quevedo (2012), Braveman
et al. (2010), Wilson (2009), WHO-CSDH (2008), Lynch et al. (2004),
Krieger, Williams and Moss (1997)]. Public health programs can leverage
social determinants of health to address health inequities and improve
health outcomes, as discussed in a recent supplement to Public Health
Reports [Dean, Williams and Fenton (2013)]. The U.S. Healthy People 2020
(HP2020) initiative emphasizes the importance of addressing the social
determinants of health and eliminating disparities: two of its four
overarching goals are to “create social and physical environments that
promote good health for all” and “achieve health equity, eliminate
disparities, and improve the health of all groups” [DHHS (2014)].

Improving overall population health while simultaneously striving to
eliminate health disparities is a fundamental public health and social
policy challenge, because interventions designed to improve the health of
individuals may increase disparities between groups and, conversely,
reducing a group’s burden of disease may have little impact on overall
population health [Frohlich and Potvin (2008), Mechanic (2002), Rose
(1985)]. Therefore, it is imperative that measures of health disparities
be explicit about the value judgments and trade-offs that are inherent to
their methodology: for example, choice of reference for evaluating
disparities, relative versus absolute disparities, attainment (i.e.,
favorable outcomes) versus shortfall (i.e., adverse outcomes) inequalities,
equally-weighted versus population-weighted groups, etc. [Lambert and
Zheng (2011), Harper et al. (2010), Erreygers (2009a), Keppel et al.
(2005), Mackenbach and Kunst (1997)].

In the context of socioeconomic disparities in health, the slope index of
inequality [Pamuk (1988, 1985)], the classical concentration index
[Wagstaff, Paci and van Doorslaer (1991)] and the health achievement index
[Wagstaff (2002)] have provided the impetus for much of the literature on
socioeconomic health inequality measures. For example, the partial
concentration index removes the effect of covariates (e.g., age or sex)
that may be correlated with both health and income but may be irrelevant
to policy in that neither their direct effect on health nor their joint
distribution with income can be altered [Gravelle (2003)]. Further, an
intuitive policy-oriented interpretation of the concentration index ensues
from certain redistribution schemes [Koolman and van Doorslaer (2004)].

A slope index of inequality consists of the slope of the (weighted)
least-squares regression of health outcomes onto socioeconomic ranking and
is designed to summarize the association between health and socioeconomic
status (SES). Similarly, the classical concentration index can be written
as twice the covariance between socioeconomic rank and health shares. A
health achievement index represents an equally-distributed level of health
equivalent to the population average but such that all groups achieve the
same average outcome. Those three indices are interrelated; they are
reviewed in Section 3 of this paper.

Even though the concentration index is widely used, due to its simple
formulation and its appeal to policy makers, its shortcomings have come
under intense scrutiny in recent years [Bleichrodt, Rohde and Ourti (2012)]
and various options for correcting its behavior, especially when measuring
socioeconomic inequality in a binary health outcome variable, have been
debated [Kjellsson and Gerdtham (2013), Wagstaff (2011), Erreygers (2009b)].

This paper is not intended as a critique of the concentration index.
Instead, it builds on the differential weighting scheme for socioeconomic
groups [Berrebi and Silber (1981)] that the concentration index utilizes
and explores a two-parameter alternative to the concentration index that
is derived from Renyi divergence and includes the entropy-based Renyi
index of Talih (2013b) as a special case. The proposed approach builds a
bridge between the theory of rank-dependent social welfare functions and
the information-theoretic evaluation of divergence between probability
distributions. On the one hand, there is an extensive statistical
literature on discrepancy measures, with applications to goodness-of-fit
tests, robust parameter estimation and signal processing; see Talih
(2013b) and the references therein. On the other hand, social welfare
theory provides a framework for the measurement and characterization of
socioeconomic inequalities in health [Erreygers and van Ourti (2011),
Bleichrodt and van Doorslaer (2006)], though social justice principles
remain foundational in socioeconomic inequality measurement [Bommier and
Stecklov (2002), Peter (2001)].

In parallel with the development of rank-dependent inequality indices,
there is renewed interest in composite indices [Asada, Yoshida and Whipp
(2013)], particularly for analyses and international comparisons of
wellbeing, for example, using the Human Development Index [Foster,
McGillivray and Seth (2013), Paruolo, Saisana and Saltelli (2013)]. In the
U.S., composite measures of health and health-related quality of life
remain core tools for monitoring progress toward the HP2020 goals [DHHS
(2014)]. The focus on multidimensional analyses is also manifested in the
development of indices for multidimensional inequality [Bennett and Mitra
(2013), Decancq and Lugo (2009), Tsui (1999), Maasoumi (1986)].

The Renyi index (RI), reviewed in Section 2, is a class of inequality
indices, $\{\mathrm{RI}_\alpha : \alpha > 0\}$, that is derived from Renyi
divergence [Talih (2013b)]. The parameter $\alpha > 0$ is an inequality
aversion parameter. The RI is invariant to the choice of the reference used
for evaluating disparities. This invariance property is relevant to HP2020
and related public health initiatives because, as mentioned previously, the
identification of a reference involves a value judgment and, moreover, can
be affected by statistical reliability [NCHS (2011)]. As discussed in
Section 2, the well-known generalized entropy (GE) class also can be
modified for reference invariance. Yet, the RI is more robust than its
GE-based counterpart to changes in the distribution of the adverse health
outcome.

Section 3 extends the RI to population groups that are ordered by family
income, educational attainment or other SES variables (or composites
thereof) that contribute to the social determinants of health. A
two-parameter rank-dependent RI is proposed in Section 3.2, where increased
values of $\alpha > 0$ reflect an increased societal aversion to (pure)
health inequality and increased values of $\nu > 1$ allow groups with lower
SES to weigh more heavily than groups with higher SES. Section 3.3 shows
how the rank-dependent RI can be derived from a rank-dependent social
welfare function, relating the proposed index to the Makdissi-Yazbeck
two-parameter classes of health achievement and inequality indices
[Makdissi and Yazbeck (2012)]; in turn, those extend the corresponding
Wagstaff classes of indices [Wagstaff (2002)], reviewed in Section 3.1. (In
Appendix A, a “convenient regression” relates the rank-dependent RI to the
slope index of inequality.) In Section 3.4, the GE class of indices is
modified for rank dependence (and reference invariance). Simulation results
in Section 4.1 provide empirical evidence that the rank-dependent RI is
more robust than either of its Makdissi-Yazbeck or GE-based counterparts to
changes in the distributions of SES or health outcomes.

Sections 4.2 and 4.3 illustrate the proposed methodology using data from
the U.S. National Health and Nutrition Examination Survey (NHANES), CDC,
NCHS, as well as data from the U.S. Surveillance, Epidemiology and End
Results (SEER) Program, NIH, NCI. Such health survey and registry data are
common for tracking population-based HP2020 objectives. The standard error
and sampling distribution of the rank-dependent RI for these data are
evaluated using linearization and resampling techniques. Even though
progress has been made in understanding the asymptotic behavior of health
inequality indices [Cowell, Davidson and Flachaire (2011), Aaberge (2005)]
and first-order linearization can be adapted for evaluating the sampling
variability of such indices [see Appendix B, and Langel and Tille (2013),
Borrell and Talih (2011, 2012), Biewen and Jenkins (2006), and Kakwani,
Wagstaff and van Doorslaer (1997)], resampling methods remain most useful
for evaluating statistical significance, especially with complex survey
data [Talih (2013b), Chen, Roy and Crawford (2012), Cheng, Han and Gansky
(2008), Harper et al. (2008), Rao, Wu and Yue (1992), Rao and Wu (1988)].

2. Renyi index. For a population that is partitioned into $M$ mutually
exclusive groups of sizes $n_1, n_2, \ldots, n_M$, with
$n = \sum_{j=1}^{M} n_j$ and $n_j > 0$ for $j = 1, 2, \ldots, M$, consider
the distribution of a particular adverse health outcome $y_{ij}$ for
individual $i$ in group $j$. Findings of health disparities between groups
rest on the comparison of the aggregate health outcomes
$y_{\cdot j} = \sum_{i=1}^{n_j} y_{ij}$, $j = 1, 2, \ldots, M$, either to
one another or to the total, $y_{\cdot\cdot} = \sum_{j=1}^{M} y_{\cdot j}$.
Below, $y_{\cdot\cdot}$ is assumed to be positive (i.e., the outcome of
interest is observed) and the average adverse health outcomes for the
groups and the total population are denoted
$\bar{y}_{\cdot j} = y_{\cdot j}/n_j$ and
$\bar{y}_{\cdot\cdot} = y_{\cdot\cdot}/n$, respectively.

Definition. Let relative health disparities $r_j$ be proportional to the
groups’ average adverse health outcomes: $r_j \propto \bar{y}_{\cdot j}$.
For any positive group weights $p_j$, define
$\tilde{p}_j = p_j / \sum_k p_k$ and
$\tilde{r}_j = r_j / \sum_k \tilde{p}_k r_k$. The Renyi index, which takes
values in $[0, +\infty]$, is given by

\[
\mathrm{RI}_\alpha =
\begin{cases}
\dfrac{1}{\alpha - 1} \ln \displaystyle\sum_{j=1}^{M} \tilde{p}_j \tilde{r}_j^{\,1-\alpha}, & \text{for } \alpha \neq 1,\ \alpha > 0, \\[2ex]
-\displaystyle\sum_{j=1}^{M} \tilde{p}_j \ln \tilde{r}_j, & \text{for } \alpha = 1.
\end{cases}
\tag{2.1}
\]

Thus, $\mathrm{RI}_\alpha = 0$ if
$\bar{y}_{\cdot j} = \sum_k \tilde{p}_k \bar{y}_{\cdot k}$. The expression
in (2.1) is that of the Renyi divergence between the two probability mass
functions $\tilde{p}_j$ and $\tilde{q}_j := \tilde{p}_j \tilde{r}_j$
[Talih (2013b), Renyi (1961)].

Remarks. The $p_j$ are positive weights that are assigned to each group.
Groups are equally weighted ($p_j = 1/M$), population weighted
($p_j = n_j/n$) or, otherwise, reflect a preference ordering, such as the
socioeconomic weights of Section 3. The $r_j$ are relative health
disparities, where the reference is the population average
($r_j = \bar{y}_{\cdot j}/\bar{y}_{\cdot\cdot}$), the least adverse health
outcome ($r_j = \bar{y}_{\cdot j} / \min_k \bar{y}_{\cdot k}$) or,
otherwise, any fixed reference such as a HP2020 target
($r_j = \bar{y}_{\cdot j}/y_{\mathrm{target}}$). Due to the scale
invariance of the RI, the $r_j$ need only be proportional to the groups’
average adverse health outcomes $\bar{y}_{\cdot j}$ [Talih (2013b)].

When $p_j = n_j/n$, the standardized Renyi index, with values in $[0, 1]$,
is the (between-group) Atkinson index [Atkinson (1970)], obtained from

\[
A_\alpha = 1 - e^{-\mathrm{RI}_\alpha}.
\tag{2.2}
\]

The RI increases with $\alpha$. With infinite inequality aversion
$\alpha \to \infty$, the RI is dominated by the population group with the
least adverse health outcome:

\[
\lim_{\alpha \to \infty} \mathrm{RI}_\alpha
= -\ln\Bigl(\min_{1 \le k \le M} \tilde{r}_k\Bigr) =: \mathrm{RI}_\infty.
\]

Because $0 \le A_\alpha \le A_\infty \le 1$, an alternative standardization
to that in (2.2) emerges:

\[
\frac{A_\alpha}{A_\infty}
= \frac{1 - e^{-\mathrm{RI}_\alpha}}{1 - e^{-\mathrm{RI}_\infty}}.
\]
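As a numerical illustration of the two standardizations above, the following Python sketch evaluates $A_\alpha$, $A_\infty$ and their ratio for hypothetical population 1 of Table 1, assuming population weights and the population average as reference. The function names are hypothetical and chosen for this illustration only; they are not part of the paper.

```python
import math

# Population 1 of Table 1: population shares and proportions in fair or poor health.
SHARES = [0.05, 0.15, 0.60, 0.20]
BURDEN = [0.30, 0.20, 0.15, 0.05]

def relative_disparities(p, y):
    """Normalize the weights and rescale outcomes by their weighted average."""
    total = sum(p)
    p_tilde = [w / total for w in p]
    reference = sum(w * v for w, v in zip(p_tilde, y))
    return p_tilde, [v / reference for v in y]

def renyi_index(p, y, alpha):
    """Renyi index as in (2.1); alpha must be positive."""
    p_tilde, r_tilde = relative_disparities(p, y)
    if alpha == 1:
        return -sum(w * math.log(r) for w, r in zip(p_tilde, r_tilde))
    power_sum = sum(w * r ** (1 - alpha) for w, r in zip(p_tilde, r_tilde))
    return math.log(power_sum) / (alpha - 1)

def renyi_limit(p, y):
    """Limit of the Renyi index as alpha grows without bound."""
    _, r_tilde = relative_disparities(p, y)
    return -math.log(min(r_tilde))

def atkinson(ri):
    """Standardization (2.2): maps an index value in [0, inf] to [0, 1]."""
    return 1 - math.exp(-ri)

ri_2 = renyi_index(SHARES, BURDEN, 2)
ri_inf = renyi_limit(SHARES, BURDEN)
a_2, a_inf = atkinson(ri_2), atkinson(ri_inf)
print(round(a_2, 3), round(a_inf, 3), round(a_2 / a_inf, 3))
```

The ratio printed last is the alternative standardization: it rescales $A_\alpha$ by the largest value it can reach for these data, so it attains 1 in the limit of infinite inequality aversion.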

Some of the most commonly used (between-group) health inequality indices
belong to the generalized entropy (GE) class, with values in
$[0, +\infty]$,

\[
\mathrm{GE}_\alpha = \sum_{j=1}^{M} p_j \, g_\alpha(r_j),
\]

where $p_j = n_j/n$, $r_j = \bar{y}_{\cdot j}/\bar{y}_{\cdot\cdot}$, and

\[
g_\alpha(r) =
\begin{cases}
\dfrac{1 - r^{1-\alpha}}{1 - \alpha}, & \alpha \neq 1,\ \alpha > 0, \\[2ex]
-\ln r, & \alpha = 1;
\end{cases}
\tag{2.3}
\]

see Talih (2013b) and the literature review therein. When $\alpha = 1$, the
Renyi and GE indices are equal. When $\alpha \neq 1$, $\alpha > 0$, these
indices are related as follows:

\[
\mathrm{RI}_\alpha = -\frac{1}{1 - \alpha} \ln\bigl[1 - (1 - \alpha)\,\mathrm{GE}_\alpha\bigr].
\tag{2.4}
\]
An important result from Talih (2013b) is that
$\mathrm{RI}_\alpha \le \mathrm{GE}_\alpha$ for $\alpha > 1$, which entails
that, for $\alpha > 1$, the RI is more robust than the GE index to changes
in the distribution of health outcomes. For example, consider the
hypothetical populations in Table 1, which are studied in Section 4.1
below. With the commonly used parameter value $\alpha = 2$, the RI
increases 35% between populations 1 and 3, from 0.257 to 0.348, whereas the
GE increases 42%, from 0.293 to 0.417, 1.2 times the rate of increase of
the RI. With $\alpha = 4$, the RI increases 26% between populations 1 and
3, from 0.567 to 0.717, whereas the GE increases 70%, from 1.494 to 2.534,
over 2.6 times the rate of increase of the RI. Figure 2 further illustrates
the lack of robustness of the rank-dependent GE compared with the
rank-dependent RI for a range of parameter values. Robustness is especially
important for less common adverse health outcomes because even small
absolute differences between groups can translate into very large relative
disparities $r_j$ and, therefore, large index values. Harper et al. (2010)
provide an excellent outline of the debate regarding absolute versus
relative disparities.
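The index values quoted above can be checked directly from Table 1. The following Python sketch is a minimal illustration, assuming population weights $p_j = n_j/n$ and the population average as reference; the helper names are hypothetical.

```python
import math

# Table 1: population shares and proportions in fair or poor health.
POPULATIONS = {
    1: ([0.05, 0.15, 0.60, 0.20], [0.30, 0.20, 0.15, 0.05]),
    3: ([0.20, 0.20, 0.40, 0.20], [0.30, 0.20, 0.15, 0.05]),
}

def power_sum(shares, burden, alpha):
    """Sum of p_j * r_j**(1 - alpha), with the population average as reference."""
    mean = sum(p * y for p, y in zip(shares, burden))
    return sum(p * (y / mean) ** (1 - alpha) for p, y in zip(shares, burden))

def renyi_index(shares, burden, alpha):
    """Renyi index for alpha > 0, alpha != 1."""
    return math.log(power_sum(shares, burden, alpha)) / (alpha - 1)

def ge_index(shares, burden, alpha):
    """Generalized entropy index for alpha > 0, alpha != 1."""
    return (1 - power_sum(shares, burden, alpha)) / (1 - alpha)

results = {}
for alpha in (2, 4):
    for label, (shares, burden) in POPULATIONS.items():
        ri = renyi_index(shares, burden, alpha)
        ge = ge_index(shares, burden, alpha)
        results[(alpha, label)] = (ri, ge)
        print(f"alpha={alpha} population={label} RI={ri:.3f} GE={ge:.3f}")
```

Because the GE index is an affine function of the power sum while the RI is its logarithm, the same change in the power sum moves the GE index more than the RI once $\alpha > 1$.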

3. Rank dependence and differential weighting. The crucial difference
between a rank-dependent health disparity index and a health disparity
index that is not rank dependent is that the former accounts for the
association between an exposure (e.g., SES) and an outcome (e.g., late-stage
uterine cervical cancer), whereas the latter accounts only for inequalities
in the outcome variable.

Let the population groups be ranked from lowest to highest SES, with
$n_j > 0$. For $j = 1, \ldots, M$, define rank variables $R_j$ as follows:

\[
R_j = \frac{1}{n}\Bigl(\sum_{k=1}^{j-1} n_k + \frac{n_j}{2}\Bigr).
\]

By construction, $0 < R_j < R_{j+1} < 1$ for $j = 1, \ldots, M - 1$. For
scalar $\nu \ge 1$, define

\[
w_\nu(R_j) = \nu (1 - R_j)^{\nu - 1}.
\]

The rank-dependent Renyi index proposed in this paper is derived from
(2.1) using the socioeconomic weights $p_j^{(\nu)} = w_\nu(R_j) p_j$
instead of just $p_j$, as seen in Section 3.2 below. For $\nu > 1$, the
initial weights $p_j^{(1)} = p_j$ are rescaled according to the rank of
each group: groups with lower SES are weighted more heavily. In particular,

\[
w_\nu(R_j) = \Bigl(\frac{1 - R_j}{R_j}\Bigr)^{\nu - 1} w_\nu(1 - R_j).
\tag{3.2}
\]

For example, suppose groups are equally weighted to start, that is,
$p_j = 1/M$. Then, for $\nu = 2$, the socioeconomic weight for a group at
the first quintile of the SES distribution (i.e., with $R_j = 0.20$) would
be 4 times the socioeconomic weight for the corresponding group at the
fourth quintile of the SES distribution. With $\nu = 3$, this factor grows
to $4^{\nu - 1} = 4^2 = 16$. When groups are population weighted initially,
that is, $p_j = n_j/n$, the effect of increasing the value of the parameter
$\nu$ is not as clear cut. Still, Figure 1 shows, for example, that moving
from $\nu = 1$ to $\nu = 3$ triples the relative weight of the “poor” and
more than doubles the relative weight of the “near poor,” while rendering
the weight on the “high income” group negligibly small (these groups are
defined in Table 1). The selection of the parameter $\nu$, in practice,
will vary according to the context and data. The analyst is advised to
explore different scenarios and, if required, select the parameter $\nu$
that most closely reflects his/her expectation.
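The weighting arithmetic in this example can be reproduced with a short Python sketch. The quintile ratios depend only on the weight function $\nu(1 - R)^{\nu - 1}$. The population-weighted part additionally assumes that each group's rank is the population share below the group plus half of its own share (a midpoint convention that is an assumption of this sketch), and all function names are hypothetical.

```python
def weight(nu, rank):
    """Rank-dependent weight nu * (1 - R)**(nu - 1), for nu >= 1."""
    return nu * (1 - rank) ** (nu - 1)

def midpoint_ranks(shares):
    """Assumed rank convention: share below the group plus half its own share."""
    ranks, below = [], 0.0
    for share in shares:
        ranks.append(below + share / 2)
        below += share
    return ranks

def relative_ses_weights(shares, nu):
    """Socioeconomic weights w_nu(R_j) * p_j, rescaled to sum to one."""
    raw = [weight(nu, r) * p for r, p in zip(midpoint_ranks(shares), shares)]
    total = sum(raw)
    return [w / total for w in raw]

# Equally weighted groups at the first and fourth quintiles of SES.
ratio_nu2 = weight(2, 0.20) / weight(2, 0.80)
ratio_nu3 = weight(3, 0.20) / weight(3, 0.80)

# Population 1 of Table 1: poor, near poor, middle income, high income.
shares = [0.05, 0.15, 0.60, 0.20]
weights_nu3 = relative_ses_weights(shares, 3)
change = [w / p for w, p in zip(weights_nu3, shares)]
print(ratio_nu2, ratio_nu3, [round(c, 2) for c in change])
```

Under the assumed rank convention, the last list gives the factor by which each group's relative weight changes when moving from $\nu = 1$ to $\nu = 3$.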

As seen next, the slope index of inequality, the classical concentration
index, and the extended concentration and health achievement indices all
utilize such differential SES weighting as in (3.2), either implicitly or
explicitly.

3.1. Concentration and health achievement indices. Consider the Q-Q
plot of the cumulative distribution of health burden $y_{\cdot j}$ against
the SES rank variables $R_j$ defined previously. The classical
concentration index is defined as twice the area between the resulting Q-Q
curve and the diagonal; equivalently, it can be written as twice the
covariance between SES rank and health burden, which directly relates it to
the slope index of inequality; see Wagstaff, Paci and van Doorslaer (1991),
Pamuk (1985, 1988), as well as Appendix A.

With $p_j = n_j/n$, $r_j = \bar{y}_{\cdot j}/\bar{y}_{\cdot\cdot}$, and a
normalizing constant $W = \sum_j (1 - R_j) p_j$, the classical
concentration index, with values in $[-1, +1]$, can be written as

\[
C = 1 - \frac{1}{W} \sum_{j=1}^{M} (1 - R_j) \, p_j r_j.
\]

The index $C$ takes the value 0 when the aforementioned Q-Q curve coincides
with the diagonal (i.e., when the covariance between SES rank and health
burden is zero).

Even though the concentration index $C$ initially appears value neutral,
this latest expression reveals that $C$ is value laden: all else being
equal, the relative disparities $r_j$ for groups with lower SES (i.e.,
lower rank $R_j$) are weighted more heavily than those for groups with
higher SES; specifically, $C$ uses $\nu = 2$ in (3.2).

To enable the analyst to account more explicitly for such a value judgment
with respect to the differential weighting of the groups, Wagstaff (2002)
introduced the extended concentration index, defined for $\nu \ge 1$ as

\[
C(\nu) = 1 - \frac{1}{W(\nu)} \sum_{j=1}^{M} (1 - R_j)^{\nu - 1} p_j r_j,
\]

with normalizing constant $W(\nu) = \sum_j (1 - R_j)^{\nu - 1} p_j$. As
previously, increasing the value of $\nu$ results in increasingly larger
weights placed on the groups with lower SES, whereas groups with higher SES
are assigned increasingly smaller weights. Thus, the parameter $\nu$
reflects a degree of socioeconomic inequality aversion.

Between-group disparities, as well as the socioeconomic weighting of the
groups, are sensitive to the implicit (or explicit) value judgments
underlying the classical (or extended) concentration index. Moreover,
assessing disparity based solely on an average health burden fails to
account for that burden’s association with SES and the extent of inequality
between the lower and higher SES groups. Wagstaff (2002) introduced a
(rank-dependent) health achievement index to quantify this trade-off
between improving population health and reducing health inequality. The
Wagstaff health achievement index is defined for $\nu \ge 1$ as

\[
H(\nu) = \frac{1}{W(\nu)} \sum_{j=1}^{M} (1 - R_j)^{\nu - 1} p_j \, \bar{y}_{\cdot j}.
\]

With $p_j = n_j/n$ and $r_j = \bar{y}_{\cdot j}/\bar{y}_{\cdot\cdot}$,
$H(1) = \bar{y}_{\cdot\cdot}$, the population average, and the
concentration and health achievement indices are related as follows:

\[
H(\nu) = [1 - C(\nu)] \times \bar{y}_{\cdot\cdot}.
\tag{3.3}
\]

As before, consider a particular adverse health outcome $y_{ijk}$ for
individual $i$ in SES group $j$ and population $k$, for example, late-stage
uterine cervical cancer by SES within racial/ethnic population groups in
the U.S. If SES was not accounted for [e.g., $\nu = 1$ and $C(1) = 0$],
then only the population means would be compared; for example, the mean
$\bar{y}_{\cdot\cdot 1}$ for population 1 might be higher than the mean
$\bar{y}_{\cdot\cdot 2}$ for population 2, signifying a higher cancer
burden for population 1 than for population 2 (e.g., 9.0 versus 6.4 per
100,000). On the other hand, if SES was accounted for [e.g., $\nu > 1$ and
$|C(\nu)| > 0$], and it was ascertained that the two populations had the
same value of $H(\nu)$ (e.g., 8.64 per 100,000 for both populations), then
this could occur because, say, population 1 had a more equal distribution
across SES groups [Q-Q curve closer to the diagonal, e.g.,
$C(\nu) = 0.04$], whereas population 2 had a higher burden of disease for
the lower SES groups [Q-Q curve farther from the diagonal, e.g.,
$C(\nu) = -0.35$].
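The figures in this example follow from relation (3.3). A minimal Python check, with a hypothetical helper name, is:

```python
def health_achievement(mean_burden, concentration):
    """Relation (3.3): achievement equals (1 - C) times the population mean."""
    return (1 - concentration) * mean_burden

# Rates per 100,000 and concentration index values quoted in the example.
h_pop1 = health_achievement(9.0, 0.04)
h_pop2 = health_achievement(6.4, -0.35)
print(round(h_pop1, 2), round(h_pop2, 2))
```

Both populations reach the same achievement level even though their means differ, because the negative concentration index for population 2 inflates its lower mean.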

Similarly for comparisons over time for a single population, the mean
$\bar{y}_{\cdot\cdot}$ could remain unchanged, yet the health achievement
could become worse due to a shift in the SES distribution of health burden.
Incidentally, precisely for this reason, Chen et al. (2013) caution against
causal inference from socioeconomic health inequality indices such as the
slope index of inequality or the concentration index. Nonetheless, such
indices remain useful for descriptive as well as comparative analyses in
large indicator initiatives, where resource limitations do not always
permit in-depth causal analyses; the HP2020 initiative, for example, houses
over 1200 health indicators [DHHS (2014)].


3.2. Rank-dependent Renyi index. As stated previously, the rank-dependent
RI is derived from (2.1) using the socioeconomic weights
$p_j^{(\nu)} = w_\nu(R_j) p_j$. To better highlight its connection to
social evaluation functions in Section 3.3, we introduce appropriate
notation here, and re-express the rank-dependent RI accordingly. In
addition, to simplify the remainder of this paper, the $p_j$ will,
henceforth, denote the population-weighted group weights $p_j = n_j/n$.
However, identical derivations follow for equally-weighted groups as well
as any other group weights as a starting point $p_j^{(1)}$.

Notation. For $r > 0$, let $f_\alpha$ denote the power transform and
$f_\alpha^{-1}$ its inverse:

\[
f_\alpha(r) =
\begin{cases}
\dfrac{r^{1-\alpha}}{1 - \alpha}, & \alpha \neq 1,\ \alpha > 0, \\[2ex]
\ln r, & \alpha = 1,
\end{cases}
\qquad
f_\alpha^{-1}(s) =
\begin{cases}
[(1 - \alpha) s]^{1/(1-\alpha)}, & \alpha \neq 1,\ \alpha > 0, \\[1ex]
e^{s}, & \alpha = 1.
\end{cases}
\tag{3.4}
\]

For $\alpha > 0$, the function $f_\alpha$ is the generalized logarithm.
Define

\[
W_1(\nu) = \sum_{j=1}^{M} w_\nu(R_j) p_j, \qquad
W_2(\nu) = \sum_{j=1}^{M} w_\nu(R_j)^2 p_j, \qquad
\tilde{w}_\nu(R_j) = \frac{w_\nu(R_j)}{W_1(\nu)}.
\]

Let $\tilde{p}_j^{(\nu)} = \tilde{w}_\nu(R_j) p_j$ and
$\tilde{r}_j^{(\nu)} = r_j / \sum_k \tilde{p}_k^{(\nu)} r_k$. Using this
notation, we have

\[
S(\nu, \alpha) = \sum_{j=1}^{M} \tilde{w}_\nu(R_j) \, p_j \, f_\alpha\bigl(\tilde{r}_j^{(\nu)}\bigr),
\]

and the rank-dependent Renyi index from (2.1) is expressed for all
$\alpha > 0$ and $\nu \ge 1$ as

\[
\mathrm{RI}_\alpha^{(\nu)} = -\ln f_\alpha^{-1}\bigl(S(\nu, \alpha)\bigr).
\tag{3.5}
\]


3.3. Rank-dependent social evaluation function. A two-parameter social
evaluation function is given in aggregate form by

\[
S^*(\nu, \alpha) = \sum_{j=1}^{M} \tilde{w}_\nu(R_j) \, p_j \, f_\alpha(\bar{y}_{\cdot j}),
\tag{3.6}
\]

where $f_\alpha(\bar{y}_{\cdot j})$, $\alpha > 0$, represents society’s
evaluation of the group’s health burden $\bar{y}_{\cdot j}$ and
$\tilde{w}_\nu(R_j) p_j = \tilde{p}_j^{(\nu)}$ is the group’s socioeconomic
weight [Makdissi and Yazbeck (2012)].

Remark. The asterisk in $S^*(\nu, \alpha)$ is to distinguish it from the
relative measure $S(\nu, \alpha)$ defined previously, where the social
evaluation function $f_\alpha$ was evaluated at the relative disparities
$\tilde{r}_j^{(\nu)}$ instead of the average health outcomes
$\bar{y}_{\cdot j}$.

In the above, two components of societal evaluation of health are
featured:

(i) A pure health inequality component, driven by society’s evaluation
of a group’s health burden irrespective of its SES rank: the function
$f_\alpha$ in (3.4) has constant relative-inequality aversion
$\alpha = -y f_\alpha''(y)/f_\alpha'(y)$ [Cowell and Gardiner (1999), Pratt
(1964)]. In particular, $\nu = 1$ results in a social preference function
that is indifferent to SES (at least explicitly, since, directly or
indirectly, SES remains a determinant of health).

(ii) A socioeconomic health inequality component, driven by the
rank-dependent weighting function $w_\nu(R_j)$: the parameter $\nu$ is a
socioeconomic health inequality aversion parameter, with hyperbolic
absolute-inequality aversion
$(\nu - 2)/(1 - R) = -w_\nu''(R)/w_\nu'(R)$ when $\nu > 2$. Here,
$\alpha = 0$ results in a social preference function that is indifferent to
pure health inequalities (again, at least explicitly), quantifying solely
the distribution of the adverse health outcome along the SES gradient, as
in the extended concentration index of Wagstaff (2002).

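A rank-dependent health achievement index is obtained from
$H^* = f_\alpha^{-1}(S^*)$ in (3.6); it represents an equally-distributed
equivalent level of health such that $S^*$ is equivalent to
$f_\alpha(H^*)$, that is, a hypothetical society in which all groups
achieve an average outcome $\bar{y}_{\cdot j}$ equal to $H^*$. The Makdissi
and Yazbeck (2012) health achievement index is expressed as

\[
H^*(\nu, \alpha) =
\begin{cases}
\Bigl[\displaystyle\sum_{j=1}^{M} \tilde{w}_\nu(R_j) \, p_j \, \bar{y}_{\cdot j}^{\,1-\alpha}\Bigr]^{1/(1-\alpha)}, & \text{for } \alpha \neq 1,\ \alpha > 0, \\[3ex]
\exp\Bigl[\displaystyle\sum_{j=1}^{M} \tilde{w}_\nu(R_j) \, p_j \ln \bar{y}_{\cdot j}\Bigr], & \text{for } \alpha = 1.
\end{cases}
\tag{3.7}
\]

For example, $H^*(1, 0)$ is the population average outcome
$\sum_j p_j \bar{y}_{\cdot j} = \bar{y}_{\cdot\cdot}$ (when
$p_j = n_j/n$), whereas $H^*(\nu, 0)$ is the SES-weighted population
average outcome $\sum_j \tilde{w}_\nu(R_j) p_j \bar{y}_{\cdot j}$. In
addition, the two limiting cases $\alpha \to \infty$ and
$\nu \to \infty$ are important for interpretation:

\[
H^*(\nu, \infty) := \lim_{\alpha \to \infty} H^*(\nu, \alpha) = \min_{1 \le k \le M} \bar{y}_{\cdot k}
\quad \text{and} \quad
H^*(\infty, \alpha) := \lim_{\nu \to \infty} H^*(\nu, \alpha) = \bar{y}_{\cdot k^*},
\tag{3.8}
\]

where $k^* = \arg\min_{1 \le k \le M} R_k$ is the group with the lowest SES
rank.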

As $\nu \ge 1$ increases, more weight is given to the group with the lowest
SES. If the SES gradient in health is positive when groups are ranked from
highest to lowest SES, then the group with the lowest SES will also have
the worst health outcome
$\bar{y}_{\cdot k^*} = \max_k \bar{y}_{\cdot k}$. Thus, when
$\nu \to \infty$, society’s health achievement becomes only as good as that
of its socioeconomically most disadvantaged [Rawls (1999)]. On the other
hand, holding the parameter $\nu$ constant, health achievement can only be
improved at a progressively steeper cost of nonintervention, as reflected
by increasing $\alpha > 0$. In a society that is infinitely averse to
inequality (and that has unlimited resources), all groups achieve the best
group rate $H^*(\nu, \infty) = \min_k \bar{y}_{\cdot k}$.

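The rank-dependent RI in (3.5) provides a unified mathematical framework
for engaging in the aforementioned considerations. The index
$\mathrm{RI}_\alpha^{(\nu)}$ and the standardized index $A_\alpha^{(\nu)}$
are

\[
\mathrm{RI}_\alpha^{(\nu)} = -\ln\biggl[\frac{H^*(\nu, \alpha)}{H^*(\nu, 0)}\biggr]
\quad \text{and} \quad
A_\alpha^{(\nu)} = 1 - \frac{H^*(\nu, \alpha)}{H^*(\nu, 0)}.
\tag{3.9}
\]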


In other words, what equations (3.9), as well as Figure 1 below, show is
that, for each given value of $\nu \ge 1$, the standardized rank-dependent
RI expresses the relative change that would be required to “move the
needle” from the status quo [e.g., the reference achievement level
$H^*(\nu, 0)$, an SES-weighted population average health burden] to a level
of health achievement that is compatible with societal expectations
[achievement level $H^*(\nu, \alpha)$ for aversion parameter value
$\alpha$].
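To make the connection between the achievement index and the rank-dependent RI concrete, the following Python sketch evaluates both for hypothetical population 1 of Table 1 and compares the achievement-ratio form in (3.9) with a direct evaluation of the index from SES-weighted relative disparities. It assumes midpoint ranks (the share of the population below a group plus half of the group's own share); that convention and all function names are assumptions of this sketch.

```python
import math

SHARES = [0.05, 0.15, 0.60, 0.20]   # population 1 of Table 1
BURDEN = [0.30, 0.20, 0.15, 0.05]

def ses_weights(shares, nu):
    """Normalized weights w_nu(R_j) * p_j, assuming midpoint ranks R_j."""
    raw, below = [], 0.0
    for share in shares:
        rank = below + share / 2
        raw.append(nu * (1 - rank) ** (nu - 1) * share)
        below += share
    total = sum(raw)
    return [w / total for w in raw]

def achievement(shares, burden, nu, alpha):
    """Health achievement index: a weighted power mean of group burdens."""
    weights = ses_weights(shares, nu)
    if alpha == 1:
        return math.exp(sum(w * math.log(y) for w, y in zip(weights, burden)))
    total = sum(w * y ** (1 - alpha) for w, y in zip(weights, burden))
    return total ** (1 / (1 - alpha))

def rank_dependent_ri(shares, burden, nu, alpha):
    """Rank-dependent Renyi index from the ratio of achievement levels."""
    reference = achievement(shares, burden, nu, 0)
    return -math.log(achievement(shares, burden, nu, alpha) / reference)

def rank_dependent_ri_direct(shares, burden, nu, alpha):
    """Same index computed from SES-weighted relative disparities."""
    weights = ses_weights(shares, nu)
    reference = sum(w * y for w, y in zip(weights, burden))
    rel = [y / reference for y in burden]
    if alpha == 1:
        return -sum(w * math.log(r) for w, r in zip(weights, rel))
    total = sum(w * r ** (1 - alpha) for w, r in zip(weights, rel))
    return math.log(total) / (alpha - 1)

for nu in (1, 3):
    ri = rank_dependent_ri(SHARES, BURDEN, nu, 2)
    reference = achievement(SHARES, BURDEN, nu, 0)
    print(nu, round(reference, 4), round(ri, 4), round(1 - math.exp(-ri), 4))
```

For $\nu = 1$ the weights reduce to the population shares, so the reference level is the population average and the index coincides with the RI of Section 2.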
Two-parameter extended concentration index. As in (3.3), when
$p_j = n_j/n$, the two-parameter extended concentration index [Makdissi and
Yazbeck (2012)]

\[
C(\nu, \alpha) = 1 - \frac{H^*(\nu, \alpha)}{\bar{y}_{\cdot\cdot}}
\tag{3.10}
\]

compares the requisite equally-distributed equivalent health level
$H^*(\nu, \alpha)$ to the population average health outcome
$\bar{y}_{\cdot\cdot}$. $C(\nu, \alpha)$ corresponds to the standardized
index $A_\alpha^{(\nu)}$ that would be obtained if one used
$\tilde{r}_j = \bar{y}_{\cdot j} / \sum_k p_k \bar{y}_{\cdot k} = \bar{y}_{\cdot j}/\bar{y}_{\cdot\cdot}$
instead of
$\tilde{r}_j^{(\nu)} = \bar{y}_{\cdot j} / \sum_k \tilde{p}_k^{(\nu)} \bar{y}_{\cdot k}$
in (2.1). However, unlike the standardized index $A_\alpha^{(\nu)}$ in
(3.9), $C(\nu, \alpha)$ does not remain nonnegative. $C(\nu, 0)$ and
$C(2, 0)$ are the extended [$C(\nu)$] and classical ($C$) health
concentration indices, respectively; see Section 3.1. Instead of the
population average outcome $\bar{y}_{\cdot\cdot} = H^*(1, 0)$ as reference
for health achievement, the standardized index $A_\alpha^{(\nu)}$ in (3.9)
uses the SES-weighted average $H^*(\nu, 0)$. The relationship between the
standardized rank-dependent RI, the two-parameter extended concentration
index and the extended concentration index is as follows:

\[
1 - A_\alpha^{(\nu)} = \frac{1 - C(\nu, \alpha)}{1 - C(\nu, 0)}.
\]

Achievement versus capacity to achieve. As noted in Section 2, the
standardization in (2.2) is not fully satisfactory in that
$A_\alpha \to 1$ only if $\mathrm{RI}_\alpha \to \infty$. Thus, for the
rank-dependent RI, the following standardization may be preferable:

\[
\frac{A_\alpha^{(\nu)}}{A_\infty^{(\nu)}}
= \frac{H^*(\nu, 0) - H^*(\nu, \alpha)}{H^*(\nu, 0) - H^*(\nu, \infty)}.
\]

Holding the parameter $\nu$ constant, $A_\alpha^{(\nu)}/A_\infty^{(\nu)}$
is the proportion of the maximum potential improvement in health
achievement $[H^*(\nu, 0) - H^*(\nu, \infty)]$ that would be attained at
nonintervention cost $\alpha > 0$ if all groups were to achieve
$\bar{y}_{\cdot j} = H^*(\nu, \alpha)$ instead of only
$\bar{y}_{\cdot j} = H^*(\nu, 0)$.

Figure 1 illustrates the notion of health achievement relative to the
population’s “capacity to achieve” using data for hypothetical population 1
in Table 1, with socioeconomic health inequality parameter $\nu = 1$
(rank-neutral group weights) and $\nu = 3$ (weights favorable to groups
with low income level). The reference “achievement” level $H^*(\nu, 0)$,
that is, the income-weighted population proportion in fair or poor health,
is higher for larger $\nu$, resulting in a larger gap relative to the best
rate $H^*(\nu, \infty)$. A larger $\nu$ results in a larger $\alpha$ (that
is, a higher “cost of nonintervention”) for about the same achievement
level $H^*(\nu, \alpha) \approx 8\%$.
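The comparison described for Figure 1 can be explored numerically. The Python sketch below solves, by bisection, for the value of $\alpha$ at which the achievement level of population 1 equals 8% under $\nu = 1$ and $\nu = 3$, and reports the corresponding share of the capacity to achieve. It assumes midpoint ranks, relies on the achievement index being decreasing in $\alpha$, and uses hypothetical function names; the search interval is also an arbitrary choice made for this illustration.

```python
import math

SHARES = [0.05, 0.15, 0.60, 0.20]   # population 1 of Table 1
BURDEN = [0.30, 0.20, 0.15, 0.05]

def ses_weights(shares, nu):
    """Normalized weights w_nu(R_j) * p_j, assuming midpoint ranks R_j."""
    raw, below = [], 0.0
    for share in shares:
        rank = below + share / 2
        raw.append(nu * (1 - rank) ** (nu - 1) * share)
        below += share
    total = sum(raw)
    return [w / total for w in raw]

def achievement(shares, burden, nu, alpha):
    """Health achievement index: a weighted power mean of group burdens."""
    weights = ses_weights(shares, nu)
    if alpha == 1:
        return math.exp(sum(w * math.log(y) for w, y in zip(weights, burden)))
    total = sum(w * y ** (1 - alpha) for w, y in zip(weights, burden))
    return total ** (1 / (1 - alpha))

def capacity_ratio(shares, burden, nu, alpha):
    """Share of the gap between the reference level and the best rate closed."""
    reference = achievement(shares, burden, nu, 0)
    best = min(burden)   # limiting achievement level as alpha grows
    return (reference - achievement(shares, burden, nu, alpha)) / (reference - best)

def alpha_for_target(shares, burden, nu, target, lo=0.0, hi=64.0):
    """Bisection for the alpha at which the achievement level hits the target."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if achievement(shares, burden, nu, mid) > target:
            lo = mid    # achievement still above target: increase alpha
        else:
            hi = mid
    return (lo + hi) / 2

solutions = {nu: alpha_for_target(SHARES, BURDEN, nu, 0.08) for nu in (1, 3)}
for nu, alpha in solutions.items():
    print(nu, round(alpha, 2), round(capacity_ratio(SHARES, BURDEN, nu, alpha), 3))
```

Under these assumptions, the larger $\nu$ requires a larger $\alpha$ to reach the same 8% level, in line with the description of Figure 1.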
[Figure 1. Panel title: Adverse health status by income level (hypothetical
population 1). Legend: excess with respect to (w.r.t.) reference level
$H^*(\nu, 0)$; excess w.r.t. desired level $H^*(\nu, \alpha)$ but below
reference level $H^*(\nu, 0)$.]

Fig. 1. Achievement versus capacity to achieve: Illustration using data for
hypothetical population 1 in Table 1, with socioeconomic health inequality
parameter $\nu = 1$ (rank-neutral group weights; top panel) and $\nu = 3$
(weights favorable to groups with low income level; bottom panel). The
reference “achievement” level $H^*(\nu, 0)$ (solid lines), that is, the
income-weighted population proportion in fair or poor health, is higher for
larger $\nu$, resulting in a larger gap relative to the best rate. A larger
$\nu$ results in a larger $\alpha$ (that is, a higher “cost of
nonintervention”) for about the same achievement level
$H^*(\nu, \alpha) \approx 8\%$ (dashed lines).
Table 1
Percentages in fair or poor health by income level for three hypothetical
populations^a

Group (j)                                  Poor   Near poor   Middle income   High income

Population 1
  Proportion of population (n_j/n)         0.05   0.15        0.60            0.20
  Percent in fair or poor health           30%    20%         15%             5%

Population 2
  Proportion of population (n_j/n)         0.05   0.15        0.60            0.20
  Percent in fair or poor health           30%    20%         5%              15%

Population 3
  Proportion of population (n_j/n)         0.20   0.20        0.40            0.20
  Percent in fair or poor health           30%    20%         15%             5%

^a Income level is expressed as a percent of the poverty threshold. Here,
poor = below 100%, near poor = 100-199%, middle income = 200-399%, and high
income = at or above 400% of the poverty threshold. The percent in fair or
poor health is the group average $\bar{y}_{\cdot j}$.


3.4. Rank-dependent generalized entropy class. As stated earlier, the GE
index (2.3) also can be modified for rank dependence. Originating in the
study of likelihood ratio tests [Chernoff (1952)], the GE class is tied to
important axiomatic properties in inequality measurement [Cowell and Kuga
(1981)] and remains widely used in the economic analysis of income
inequalities; see Talih (2013b) for a review of relevant literature.

Definition. As before, let the relative health disparities rj be proportional 
to the groups’ average adverse health outcomes: rj ocy.j. Eor any positive 
group weights pj, define pj = Pj/ YlkPk fj = rj/'^y.pk'f'k- A reference- 
invariant GE index is given by 


(3.12) 


GE„ = 


M 

-^Pjlnfj, 

i=i 


for a 7 ^ 1 , a > 0 , 


for a = l. 


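As before, a rank-dependent reference-invariant GE index is derived from
(3.12) using the socioeconomic weights $p_j^{(\nu)} = W_\nu(R_j)\,p_j$ instead of $p_j$. Using
the previous notation, the rank-dependent reference-invariant GE index is
expressed for all $\nu \geq 1$ and $\alpha \neq 1$, $\alpha \geq 0$, as

$$
GE^{(\nu)}_\alpha = \frac{1}{\alpha - 1}\Bigl[\sum_{j=1}^{M} \tilde{p}^{(\nu)}_j \bigl(\tilde{r}^{(\nu)}_j\bigr)^{1-\alpha} - 1\Bigr],
\tag{3.13}
$$

where $\tilde{p}^{(\nu)}_j$ and $\tilde{r}^{(\nu)}_j$ are the normalized quantities of the Definition, computed with
$p^{(\nu)}_j$ in place of $p_j$.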


When $\alpha = 1$, $GE^{(\nu)}_1 = RI^{(\nu)}_1$. For $\alpha \neq 1$, the rank-dependent GE index is
obtained from the (standardized) rank-dependent RI as follows, similarly
to (2.4):

$$
GE^{(\nu)}_\alpha = \frac{1}{1 - \alpha}\Bigl\{1 - \bigl[1 - RI^{(\nu)}_\alpha\bigr]^{1-\alpha}\Bigr\}.
$$

As noted earlier, an important result from Talih (2013b) is that, for $\nu \geq 1$
and $\alpha > 1$, $RI^{(\nu)}_\alpha \leq GE^{(\nu)}_\alpha$. The inequality is reversed for $0 < \alpha < 1$. Thus,
the rank-dependent RI is more conservative and, therefore, more robust to
changes in the distribution of either SES or health burden than its GE-based
counterpart for $\alpha > 1$.
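The direction of the inequality between the two classes can be checked numerically. The sketch below is illustrative only: it assumes the unstandardized, rank-neutral ($\nu = 1$) forms, namely $RI_\alpha = \ln(\sum_j \tilde{p}_j \tilde{r}_j^{1-\alpha})/(\alpha - 1)$ and $GE_\alpha = (\sum_j \tilde{p}_j \tilde{r}_j^{1-\alpha} - 1)/(\alpha - 1)$, and it uses Population 1 of Table 1 as input. The function names are hypothetical and are not taken from the paper.

```python
import math

def normalize(p, r):
    """Rescale weights to sum to 1 and outcomes to have weighted mean 1."""
    total = sum(p)
    p_t = [pj / total for pj in p]
    mean_r = sum(pj * rj for pj, rj in zip(p_t, r))
    r_t = [rj / mean_r for rj in r]
    return p_t, r_t

def power_sum(p, r, alpha):
    """Weighted sum of r_tilde^(1 - alpha), the quantity shared by both indices."""
    p_t, r_t = normalize(p, r)
    return sum(pj * rj ** (1.0 - alpha) for pj, rj in zip(p_t, r_t))

def renyi_index(p, r, alpha):
    """Assumed unstandardized Renyi index (rank-neutral weights)."""
    if alpha == 1:
        p_t, r_t = normalize(p, r)
        return -sum(pj * math.log(rj) for pj, rj in zip(p_t, r_t))
    return math.log(power_sum(p, r, alpha)) / (alpha - 1.0)

def ge_index(p, r, alpha):
    """Assumed reference-invariant GE index (rank-neutral weights)."""
    if alpha == 1:
        return renyi_index(p, r, 1)
    return (power_sum(p, r, alpha) - 1.0) / (alpha - 1.0)

# Population 1 of Table 1: population shares and percent in fair or poor health.
p = [0.05, 0.15, 0.60, 0.20]
y = [30.0, 20.0, 15.0, 5.0]

for alpha in (0.5, 2.0, 4.0):
    print(alpha, renyi_index(p, y, alpha), ge_index(p, y, alpha))
```

Under these assumed forms the ordering follows from $\ln x \leq x - 1$: dividing by $\alpha - 1$ keeps the direction when $\alpha > 1$ and flips it when $0 < \alpha < 1$.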

4. Empirical findings. 

4.1. Simulation studies. I compare the rank-dependent RI (3.5) with 
the Makdissi-Yazbeck concentration index (3.10) and the rank-dependent 
reference-invariant GE index (3.13) for hypothetical populations studied by 
Keppel et al. (2005); see Table 1. 

In Figure 2, the rank-dependent reference-invariant GE index (top row)
is standardized as in (2.2), so that it takes values in [0,1]. In addition,
because the Makdissi-Yazbeck index in (3.10) may be negative, only its
absolute value is plotted (bottom row). For $\nu = 1$, the rank-dependent RI
and the Makdissi-Yazbeck index are equal; therefore, only $\nu = 2$ and $\nu = 3$
are shown. By construction, the rank-dependent RI and rank-dependent
reference-invariant GE index are equal to 0 for $\alpha = 0$; the Makdissi-Yazbeck
index is not. Conversely, the latter may be zero for positive values of $\alpha$,
whereas the RI and GE index remain strictly positive unless $\tilde{r}_j = 1$ for all
$j$. Shown in the bottom row of Figure 2 with $\alpha = 0$, the class $C(\nu,0)$ is
the Wagstaff class $C(\nu)$ of extended concentration indices and $C(2,0)$ is the
classical health concentration index $C$; see Section 3.1.

Fig. 2. Comparison of the rank-dependent Renyi index [(3.5) and middle row] with the Makdissi-Yazbeck concentration index [(3.10)
and bottom row] and the rank-dependent reference-invariant GE index [(3.13) and top row] for hypothetical populations in Table 1. For
$\nu = 1$, the rank-dependent Renyi index and the Makdissi-Yazbeck concentration index are equal; therefore, only $\nu = 2$ (two left columns)
and $\nu = 3$ (two right columns) are shown here. By construction, the rank-dependent Renyi index and rank-dependent reference-invariant
GE index are equal to 0 for $\alpha = 0$; the Makdissi-Yazbeck index is not. The class $C(\nu,0)$ is the Wagstaff class of extended concentration
indices. For $\nu = 2$, the index $C(2,0)$ is the “classical” health concentration index.

The relative ranking of the three hypothetical populations in Table 1
changes when the parameters of any of the three indices displayed in
Figure 2 are modified. For example, for the Wagstaff class $C(\nu,0)$, setting
$\nu = 2$ yields the ranking 2 < 1 < 3 of the three populations from lowest to
highest inequality, whereas $\nu = 3$ results in the ranking 1 < 2 < 3. (Setting
$\nu = 4$, not shown, results in the ranking 1 < 3 < 2.) The Makdissi-Yazbeck
index $C(\nu,\alpha)$ further suffers from lack of smoothness as the pure health
inequality aversion parameter $\alpha$ increases, with inequality in some populations
assessed to be zero even for larger values of $\alpha$. This results in yet further
permutations of the relative ranking of the three hypothetical populations
considered. In contrast, both the rank-dependent RI and rank-dependent
reference-invariant GE index remain smooth functions of the parameter $\alpha$.
Even though the relative rankings resulting from the use of either of those
two classes of indices are usually in agreement, the rank-dependent RI is
more conservative than its GE-based counterpart for all values of $\alpha > 1$,
as known from the inequality at the end of Section 3.4. As a result, the
rankings induced from those two classes of indices may differ, especially for
larger values of $\alpha$. In addition, the rank-dependent RI is less affected than
its GE-based counterpart by changes to either the health (population 1 vs.
population 2) or income (population 1 vs. population 3) distributions.
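The rank reversals reported for the Wagstaff class can be reproduced from Table 1 with a short calculation. The sketch below rests on assumptions that are not spelled out in this section: groups are ranked by the midpoint of their cumulative population share, the rank weights are $\nu(1 - R_j)^{\nu - 1}$ rescaled to have weighted mean 1, and the index is one minus the ratio of the rank-weighted mean to the plain mean. With these choices the three rankings quoted in the text are recovered; the function names are hypothetical.

```python
def midpoint_ranks(p):
    """Fractional rank: population share below the group plus half its own share."""
    ranks, below = [], 0.0
    for pj in p:
        ranks.append(below + pj / 2.0)
        below += pj
    return ranks

def extended_concentration_index(p, y, nu):
    """Assumed form of C(nu, 0) with rank weights standardized to weighted mean 1."""
    ranks = midpoint_ranks(p)
    w = [nu * (1.0 - r) ** (nu - 1) for r in ranks]
    w_mean = sum(pj * wj for pj, wj in zip(p, w))
    mean_y = sum(pj * yj for pj, yj in zip(p, y))
    weighted_y = sum(pj * wj * yj for pj, wj, yj in zip(p, w, y)) / w_mean
    return 1.0 - weighted_y / mean_y

# Table 1: population shares and percent in fair or poor health, poorest group first.
populations = {
    1: ([0.05, 0.15, 0.60, 0.20], [30.0, 20.0, 15.0, 5.0]),
    2: ([0.05, 0.15, 0.60, 0.20], [30.0, 20.0, 5.0, 15.0]),
    3: ([0.20, 0.20, 0.40, 0.20], [30.0, 20.0, 15.0, 5.0]),
}

def ranking(nu):
    """Population labels ordered from lowest to highest absolute index value."""
    values = {
        label: abs(extended_concentration_index(p, y, nu))
        for label, (p, y) in populations.items()
    }
    return sorted(values, key=values.get)

for nu in (2, 3, 4):
    print(nu, ranking(nu))
# 2 [2, 1, 3]
# 3 [1, 2, 3]
# 4 [1, 3, 2]
```

The absolute value is taken because, under this sign convention, the index is negative when poor health is concentrated among lower-ranked groups.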

4.2. NHANES case study. During the past 20 years, there was an increase
in obesity in the U.S. Although rates have leveled off in recent years,
they remain at historically high levels. Between 1988-1994 and 2009-2010,
the obesity rate increased 69% among children and adolescents aged 2-19
years, from 10.0% to 16.9% [Ogden et al. (2012)].

Low income children and adolescents are more likely to be obese than 
their higher income counterparts [Ogden et al. (2010)]. In 2009-2010, those 
with family incomes at or above 500% of the poverty threshold had the 
lowest obesity rate, 11.5% (Table 2). Rates that differed significantly from
the lowest rate at the 0.05 level of significance for children and adolescents 
with lower family incomes were as follows: 21.6% for those under the poverty 
threshold, nearly twice the lowest rate; 17.4% for those with family incomes 
at 100-199% of the poverty threshold, about one and a half times the lowest 
rate; and 15.7% for those with family incomes at 200-399% of the poverty 
threshold, almost one and a half times the lowest rate. 

The rank variables $R_j$ are computed according to (3.1) and shown in
Table 2. Figure 3 displays the estimated rank-dependent RI together with
its bootstrapped 95% confidence interval (using $B = 1000$ bootstrap samples)
for the prevalence of obesity among children and adolescents by family
income, for NHANES 2009-2010 and the combined cycles 2001-2004 and
2005-2008. For illustration, values of the socioeconomic health inequality
parameter shown in Figure 3 are $\nu = 1$ (rank-neutral group weights; top
panel) and $\nu = 3$ (weights favorable to those with low family income; bottom
panel). Values of the pure health inequality aversion parameter shown
are $\alpha = 0.5, 1, 2, 4$ and 8. With $\nu = 3$, a slight increase in the rank-dependent
RI over time is observed, irrespective of $\alpha$. However, the relative ranking
of the three survey periods changes with $\nu$, as observed in the simulation
studies of Section 4.1 as well as in Figure 3 for $\nu = 1$. Furthermore, for all
combinations of $\nu$ and $\alpha$ shown, none of the observed differences in the
rank-dependent RI between survey periods are statistically significant at the 0.05
level of significance.
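The tabulated ranks can be approximately recovered from the population proportions. The sketch below assumes that (3.1) assigns each group the midpoint of its interval in the cumulative population distribution; that definition is not restated in this section, but it matches the published ranks for NHANES 2001-2004 to within rounding. The function name is hypothetical.

```python
def midpoint_ranks(shares):
    """Midpoint of each group's interval in the cumulative population distribution."""
    total = sum(shares)  # rounded shares need not add up to exactly 1
    ranks, below = [], 0.0
    for share in shares:
        ranks.append((below + share / 2.0) / total)
        below += share
    return ranks

# Table 2, NHANES 2001-2004: population proportions, lowest income group first.
shares_2001_2004 = [0.241, 0.242, 0.291, 0.095, 0.132]
reported_ranks = [0.120, 0.362, 0.628, 0.821, 0.934]

computed_ranks = midpoint_ranks(shares_2001_2004)
for computed, reported in zip(computed_ranks, reported_ranks):
    print(f"computed {computed:.3f}  reported {reported:.3f}")
```

Small discrepancies in the third decimal are expected because the table shows rounded proportions, whereas the paper uses unrounded values.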

Table 2

Prevalence of obesity in children and adolescents (aged 2-19 years) by family income, 2001-2010^a

Income category^c                                     1           2            3            4            5
(family income expressed as
percent of poverty threshold)                      (<100%)    (100-199%)   (200-399%)   (400-499%)    (≥500%)

NHANES 2001-2004
  Prevalence (%) in group ($\bar{y}_{\cdot j}$)      17.9        16.7         17.8         13.1          9.8
  Standard error (%)^d                               1.295       1.249        1.182        1.691        1.600
  Population in group ($p_j$)^e                      0.241       0.242        0.291        0.095        0.132
  Rank ($R_j$)^f                                     0.120       0.362        0.628        0.821        0.934

NHANES 2005-2008
  Prevalence (%) in group ($\bar{y}_{\cdot j}$)      19.9        18.2         16.0         14.3          9.8
  Standard error (%)                                 1.368       1.447        1.403        2.747        1.838
  Population in group ($p_j$)                        0.218       0.223        0.298        0.099        0.162
  Rank ($R_j$)                                       0.109       0.329        0.590        0.788        0.919

NHANES 2009-2010^b
  Prevalence (%) in group ($\bar{y}_{\cdot j}$)      21.6        17.4         15.7         14.2         11.5
  Standard error (%)                                 1.306       1.428        1.437        2.686        2.591
  Population in group ($p_j$)                        0.232       0.235        0.274        0.088        0.171
  Rank ($R_j$)                                       0.116       0.349        0.604        0.785        0.914

^a Obesity for children and adolescents aged 2-19 years is defined as body mass index (BMI)
at or above the sex- and age-specific 95th percentile from the 2000 CDC Growth Charts
for the U.S. [Troiano and Flegal (1998), Kuczmarski et al. (2002)].

^b Data are available biennially and come from the National Health and Nutrition
Examination Survey (NHANES), CDC, NCHS. Preferably four years of data are pooled for
analysis when available [Johnson et al. (2013)], but two-year data are used as a placeholder
to provide the latest data available.

^c Family income is expressed as a percent of the poverty threshold; missing values are not
included in the analysis.

^d Standard error evaluated by Taylor linearization [RTI (2012), SAS Institute (2010)].

^e Proportions are rounded for table display and may not add up to exactly 1.000; unrounded
values are used in all calculations.

^f Rank of group in cumulative distribution of population computed according to (3.1).

[Figure 3: Obesity in children and adolescents by family income, NHANES 2001-2010. Rank-dependent Renyi index and bootstrapped 95% confidence interval. Legend: a = NHANES 2001-2004, b = NHANES 2005-2008, c = NHANES 2009-2010.]

Fig. 3. Rank-dependent RI and its bootstrapped 95% confidence intervals ($B = 1000$) for
the prevalence of obesity among children and adolescents aged 2-19 years by family income,
from NHANES 2001-2010 (data in Table 2). For illustration, values of the socioeconomic
health inequality parameter shown are $\nu = 1$ (rank-neutral group weights; top panel) and
$\nu = 3$ (weights favorable to those with low family income; bottom panel). Values of the pure
health inequality aversion parameter shown along the x-axis are $\alpha = 0.5, 1, 2, 4$ and 8. For
all combinations of $\nu$ and $\alpha$ shown, observed differences between the three survey periods
in the rank-dependent RI are not statistically significant.

Notes. Obesity for children and adolescents is defined as body mass index
(BMI) at or above the sex- and age-specific 95th percentile from the 2000
CDC Growth Charts for the U.S. [Troiano and Flegal (1998), Kuczmarski
et al. (2002)]. HP2020 objective NWS-10.4 tracks the proportion of children
and adolescents aged 2-19 years who are considered obese. Data for
NWS-10.4 are from the National Health and Nutrition Examination Survey
(NHANES), CDC, NCHS. Preferably four years of data, for example,
2009-2012, are pooled [Johnson et al. (2013)]; however, at the time of writing this
paper, only the two-year data for 2009-2010 were available for analysis. The
derivation of a Taylor linearization approximation of the standard error of
the rank-dependent RI for the various combinations of the parameters $\nu$ and
$\alpha$ is presented in Appendix B.1; those standard errors are used in significance
testing for the differences in the rank-dependent RI between NHANES
2001-2004, 2005-2008 and 2009-2010. The approximate 95% confidence
intervals shown in Figure 3 are based on the rescaled bootstrap, which allows
the examination of the sampling distribution of quantities such as the
rank-dependent RI in complex survey data without relying on normality or other
distributional assumptions [Talih (2013b), Cheng, Han and Gansky (2008),
Rao, Wu and Yue (1992), Rao and Wu (1988)].
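For readers unfamiliar with the rescaled bootstrap, the following is a minimal sketch of one common variant due to Rao and Wu, in which $n_h - 1$ PSUs are resampled with replacement within each stratum and the weights are multiplied by $n_h/(n_h - 1)$ times the number of times each PSU is drawn. Whether the paper uses exactly this variant, and whether its intervals are percentile intervals, are assumptions here. The record layout, the function names and the toy data are hypothetical and are not NHANES data.

```python
import random
from collections import defaultdict

def rescaled_bootstrap_weights(records, rng):
    """One replicate of rescaled weights; records are (stratum, psu, weight) tuples."""
    psus_by_stratum = defaultdict(set)
    for stratum, psu, _ in records:
        psus_by_stratum[stratum].add(psu)

    multiplier = {}
    for stratum, psus in psus_by_stratum.items():
        psus = sorted(psus)
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        # Draw n_h - 1 PSUs with replacement within the stratum.
        draws = [rng.choice(psus) for _ in range(n_h - 1)]
        for psu in psus:
            # Rescaling factor: n_h / (n_h - 1) times the number of times the PSU was drawn.
            multiplier[(stratum, psu)] = n_h / (n_h - 1) * draws.count(psu)

    return [w * multiplier[(s, c)] for s, c, w in records]

def weighted_mean(weights, values):
    """Weighted prevalence, used here as a stand-in for the statistic of interest."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def bootstrap_percentile_interval(records, values, statistic, B=1000, seed=0, level=0.95):
    """Approximate percentile interval from B rescaled-bootstrap replicates."""
    rng = random.Random(seed)
    reps = sorted(
        statistic(rescaled_bootstrap_weights(records, rng), values) for _ in range(B)
    )
    lower = reps[int((1 - level) / 2 * (B - 1))]
    upper = reps[int((1 + level) / 2 * (B - 1))]
    return lower, upper

# Hypothetical toy design: two strata with two PSUs each.
records = [("A", 1, 10.0), ("A", 1, 12.0), ("A", 2, 9.0), ("A", 2, 11.0),
           ("B", 1, 20.0), ("B", 1, 18.0), ("B", 2, 22.0), ("B", 2, 19.0)]
outcomes = [1, 0, 0, 0, 1, 1, 0, 1]
print(bootstrap_percentile_interval(records, outcomes, weighted_mean, B=200, seed=1))
```

In an application, `weighted_mean` would be replaced by a function that recomputes the rank-dependent RI from the replicate weights.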

4.3. SEER case study. Even though incidence and death rates have declined
in recent years for all cancers, cancer remains a leading cause of
death in the U.S., second only to heart disease. The cancer objectives for 
HP2020 underscore the importance of the following: promoting evidence- 
based screening for cervical, colorectal and breast cancer in accordance with 
U.S. Preventive Services Task Force recommendations; and monitoring the 
incidence of invasive (cervical and colorectal) cancer and late-stage breast 
cancer, which are intermediate markers of cancer screening success [DHHS 
(2014)]. 

For this case study, I examine a subset of the data used to monitor HP2020
objective C-10, to reduce invasive uterine cervical cancer. Incidence and
treatment of cervical cancer show disparities by race and ethnicity, SES and
health care access [Saraiya et al. (2013), Akers, Newman and Smith (2007)].
The data in Table 3 are from the Surveillance, Epidemiology, and End Results
(SEER) Program’s 18 Regs Research Data, NIH, NCI [Surveillance
Research Program (2013), SEER Program (2013), Young et al. (2001)], and
may not be nationally representative for the U.S.; see notes below. Nonetheless,
for the cases included in Table 3, in 2010, counties where the proportion of
persons below the poverty threshold was lowest (0.00-8.91%) had the lowest
incidence of invasive uterine cervical cancer, 6.2 cases per 100,000 population
(age adjusted). Rates that differed significantly from the lowest rate
at the 0.05 level of significance for counties with lower area-level SES were
as follows: 8.7 per 100,000 for counties with the highest proportion
(18.87-56.92%) of persons below the poverty threshold, nearly one and a half times
the lowest rate; 8.0 per 100,000 for counties with the second highest proportion
(14.53-18.86%) of persons below the poverty threshold, about 1.3 times
the lowest rate; and 7.4 per 100,000 for counties with the third
highest proportion (11.61-14.52%) of persons below the poverty threshold,
19% higher than the lowest rate.

The rank variables $R_j$ are computed according to (3.1) and shown in
Table 3. Figure 4 displays the estimated rank-dependent RI and the boxplot
for its bootstrapped sampling distribution (using $B = 1000$ bootstrap samples)
under the null hypothesis of independence for the incidence of invasive
uterine cervical cancer by area SES, 2006-2010, from the SEER 18 Regs
Research Data. For illustration, values of the socioeconomic health inequality
parameter shown in Figure 4 are $\nu = 1$ (rank-neutral group weights; top
panel) and $\nu = 3$ (weights favorable to groups with low area SES; bottom
panel). Values of the pure health inequality aversion parameter shown are
$\alpha = 1, 2$ and 4. For all combinations of $\nu$ and $\alpha$ shown, the observed
rank-dependent RI differs significantly from its expected value under the null
hypothesis, indicating that the latter can be rejected. However, for all
combinations of $\nu$ and $\alpha$ shown in Figure 4, none of the changes in the index
over time are statistically significant at the 0.05 level of significance.

Table 3

Incidence of invasive uterine cervical cancer (age adjusted, per 100,000) by area SES, SEER 2006-2010^a

County quintile group ($j$)^b                          5                4                3                2               1
(percentage of persons below
poverty threshold in county)                     (18.87-56.92%)   (14.53-18.86%)   (11.61-14.52%)   (8.92-11.60%)   (0.00-8.91%)

2006
  Incidence in group ($\bar{y}_{\cdot j}$)^c           9.6              8.9              7.5              7.5             6.4
  Standard error^d                                   0.484            0.285            0.339            0.314           0.226
  Population in group ($p_j$)^e                      0.104            0.267            0.159            0.179           0.292
  Rank ($R_j$)^f                                     0.052            0.237            0.450            0.619           0.854

2007
  Incidence in group ($\bar{y}_{\cdot j}$)             9.0              9.0              8.1              6.9             6.5
  Standard error                                     0.468            0.285            0.353            0.298           0.227
  Population in group ($p_j$)                        0.104            0.265            0.160            0.179           0.292
  Rank ($R_j$)                                       0.052            0.237            0.449            0.618           0.854

2008
  Incidence in group ($\bar{y}_{\cdot j}$)             9.7              9.0              8.0              7.3             6.2
  Standard error                                     0.478            0.285            0.346            0.307           0.220
  Population in group ($p_j$)                        0.104            0.263            0.161            0.179           0.293
  Rank ($R_j$)                                       0.052            0.236            0.448            0.618           0.854

2009
  Incidence in group ($\bar{y}_{\cdot j}$)             8.5              8.4              7.0              7.7             6.4
  Standard error                                     0.450            0.274            0.324            0.316           0.223
  Population in group ($p_j$)                        0.104            0.262            0.161            0.179           0.293
  Rank ($R_j$)                                       0.052            0.236            0.447            0.617           0.853

2010
  Incidence in group ($\bar{y}_{\cdot j}$)             8.7              8.0              7.4              6.4             6.2
  Standard error                                     0.453            0.268            0.330            0.286           0.217
  Population in group ($p_j$)                        0.104            0.261            0.162            0.179           0.293
  Rank ($R_j$)                                       0.052            0.235            0.447            0.617           0.853

^a Data are from the SEER 18 Regs Research Data, NIH, NCI [SEER Program (2013),
Young et al. (2001)], and are age adjusted using the year 2000 U.S. standard population.
Data shown here do not include the full set of registries used in HP2020 to track objective
C-10; thus, data may not be nationally representative.

^b Area socioeconomic status (SES) is computed using county-level data for the 3141 counties
in the year 2000 U.S. Census. Cutpoints for each SES group, displayed in the table
header row, correspond to the county quintiles when these are sorted according to the
percentage of persons living below the poverty threshold in the county.

^c New cases of invasive uterine cervical cancer (age adjusted) per 100,000 population in
group.

^d Standard error evaluated by Taylor linearization [Surveillance Research Program (2013)].

^e Proportions are rounded for table display and may not add up to exactly 1.000; unrounded
values are used in all calculations.

^f Rank of group in cumulative distribution of population computed according to (3.1).

[Figure 4: Invasive uterine cervical cancer by area SES, SEER 2006-2010. Rank-dependent Renyi index and boxplots of its null distribution for selected parameter combinations.]

Fig. 4. Rank-dependent RI and boxplots of its bootstrapped sampling distribution
($B = 1000$) under the null hypothesis of independence for the incidence of invasive uterine
cervical cancer by area SES, 2006-2010, from the SEER Program’s 18 Regs Research Data
(data in Table 3). For illustration, values of the socioeconomic health inequality parameter
shown are $\nu = 1$ (rank-neutral group weights; top panel) and $\nu = 3$ (weights favorable to
groups with low area SES; bottom panel). Values of the pure health inequality aversion
parameter shown are $\alpha = 1, 2$ and 4. For all combinations of $\nu$ and $\alpha$ shown, the observed
rank-dependent RI differs significantly from its expected value under the null hypothesis,
indicating that the latter can be rejected. However, changes in the index over time are not
statistically significant.

Notes. U.S. cancer registries do not track individual or family income;
therefore, area-level socioeconomic characteristics, linking cancer cases to
U.S. counties, are used as a proxy for individual-level SES [Yin et al.
(2010), Harper et al. (2008)]. In addition to using data from the SEER Program,
HP2020 objective C-10 also uses data collected through the National
Program of Cancer Registries (NPCR), CDC, NCCDPHP. However, NPCR
data are not as readily linked to county-level attributes as the SEER research
data are; the latter are processed via online queries submitted securely using
the SEER*Stat software [Surveillance Research Program (2013)]. Further,
because cases are linked to counties from the year 2000 U.S. Census, the analysis
does not take into account changes in county boundaries and/or composition
over time. For these reasons, data and results presented here may not
be nationally representative for the U.S.; they are intended for illustration
purposes only. The derivation of a Taylor linearization approximation of the
standard error of the rank-dependent RI for the various combinations of the
parameters $\nu$ and $\alpha$ is presented in Appendix B.2; those standard errors are
used in significance testing for the differences in the rank-dependent RI over
time. Because of the assumption of a Poisson distribution for crude rates,
random draws are readily generated under the null hypothesis of independence.
The resulting bootstrapped null distribution for the rank-dependent
RI for each year and combination of the parameters $\nu$ and $\alpha$ is summarized
using a boxplot in Figure 4.
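One way such null draws could be generated is sketched below. This is only an illustration under simplifying assumptions: it works with crude (not age-adjusted) rates, it takes "independence" to mean that every group shares the pooled rate, and it draws Poisson counts with Knuth's multiplication method. The counts and person-years are hypothetical and are not SEER data; the function names are also hypothetical, and the paper's exact construction may differ.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's method (adequate for moderate lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def null_rate_replicates(cases, population, B=1000, seed=0):
    """Simulate group rates when all groups share the pooled rate."""
    rng = random.Random(seed)
    pooled_rate = sum(cases) / sum(population)
    replicates = []
    for _ in range(B):
        draws = [poisson_draw(n * pooled_rate, rng) for n in population]
        replicates.append([d / n for d, n in zip(draws, population)])
    return replicates

# Hypothetical case counts and person-years for five area-SES groups.
cases = [90, 210, 120, 115, 180]
population = [1.04e6, 2.61e6, 1.62e6, 1.79e6, 2.93e6]

reps = null_rate_replicates(cases, population, B=200, seed=1)
print(len(reps), [round(rate * 1e5, 2) for rate in reps[0]])
```

Evaluating a disparity index on each replicate would give a simulated null distribution against which the observed index can be compared, for instance with a boxplot.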

5. Discussion. The rank-dependent RI introduced in this paper is a
two-parameter class of socioeconomic health inequality indices,
$\{RI^{(\nu)}_\alpha : \alpha \geq 0, \nu \geq 1\}$,
where $\alpha \geq 0$ is a constant relative-inequality aversion parameter and
increasing values of the socioeconomic health inequality aversion parameter
$\nu \geq 1$ allow groups lower on the SES gradient to weigh more heavily than
groups with higher SES. In relation to competing index classes such as
the Makdissi-Yazbeck two-parameter extended concentration index and the
rank-dependent reference-invariant GE class, the rank-dependent RI is more
robust to changes in the distribution of either SES or adverse health outcomes.
The proposed method is applicable to a wide range of public health
measures and data, and statistical inference for the rank-dependent RI is
readily implemented using standard statistical software.

The proposed methods are easily extended into a multivariate setting. As
mentioned earlier in the context of the partial concentration index [Gravelle
(2003)], it may be of interest to adjust for covariates when looking at
disparities in health outcomes to rule out those parts of the SES disparity
that might be considered “just” or that, otherwise, are not amenable to
policy. As an example, if communities in lower SES are relatively older and
have higher rates of cancer for that reason, findings of socioeconomic disparities
might be attenuated by age, so adjusting for age is of importance.
Neighborhood-level or regional variation may be important for certain outcomes.
For example, analyses of illnesses such as influenza should adjust for
region when measuring disparities if an outbreak is worse in certain areas
of the country. The SEER data in Section 4.3, above, are age adjusted. One
could also apply the proposed methodology to adjusted rates obtained from
log-linear or logistic regression models. For example, Rossen and Talih (2014)
apply the (symmetrized) RI to population groups obtained from propensity
score subclassification, accounting for demographic and contextual variables
to examine disparities in weight among U.S. children and adolescents.

SES is a multidimensional construct that includes wealth, income, education
and occupation [Talih (2013a), Krieger, Williams and Moss (1997)].
Income and education are used in this paper as univariate SES measures
only for illustration purposes. The proposed methods can also be applied
to the ranking induced from any other SES measure, including composite
SES measures. Nonetheless, the analyst should keep in mind that measuring
occupation as an element of SES remains challenging. Historically, several
approaches have been used, including the Nam-Powers occupational scale
score [Boyd and Nam (2004)], which views occupation as a reflection of
education, skill, income and social status, as well as the National Opinion
Research Center’s General Social Survey occupational prestige score [Nakao
and Treas (1990)], which views occupation as an indicator of prestige. On
the other hand, the O*NET work content model [O*NET (2014)] can provide
relevant information for the measurement of socioeconomic status and
whether certain occupations lead to reduced workplace exposure, improved
access to health care or sick leave; see Baron (2012) for further discussion.

Due to its derivation from a rank-dependent social evaluation function, as
well as its origins as a measure of divergence between probability distributions,
the rank-dependent RI provides a unified mathematical framework for
modeling and/or eliciting various societal positions with regard to public
health policy. Do we favor prioritizing population groups with lower SES
(increasing $\nu \geq 1$) because, as it may be, those groups are more likely to utilize
costly public programs? For a given priority ranking on the SES groups and
a desired health achievement level for the population, what are the societal
costs of nonintervention? Is it realistic to expect all groups to attain the best
group rate ($\alpha \to \infty$)? Those policy-related questions are beyond the scope
of this paper. Rather, the aim of this paper is to provide a platform that
facilitates their discussion. Of course, public programs, whether costly or not,
do not only benefit those groups with lower SES; they also benefit groups
with higher SES. Thus, the aforementioned societal costs of nonintervention
are not limited to deciding whether or not to have programs that impact
those in lower SES. Further, there are other equity arguments outside of the
cost and benefits of policies that also could be used to justify such differential
weighting as in (3.2) when measuring socioeconomic health disparities
[Wilson (2009), Braveman (2006)]. For instance, social justice principles remain
foundational in socioeconomic inequality measurement [Bommier and
Stecklov (2002), Peter (2001)]. The analyst should be advised that, while
a cost-benefit justification does not commit them to an ethical theory a
priori, cost-benefit analyses are inherently grounded in utilitarian principles.

Even though health disparity indices are useful in that they summarize
the relationship between the distributions of disease burden and population
shares, they do not replace in-depth scientific investigation into the
complex causal pathways underlying various health outcomes. The value of
health disparity indices, such as the slope index of inequality, the concentration
index or the proposed rank-dependent Renyi index, is best appreciated
when comparisons between different populations as well as between different
time periods are desired, because the alternative option of tracking multiple
pairwise between-group comparisons over time can be prohibitive; as
mentioned earlier, large indicator initiatives such as HP2020 can house over
1200 health indicators. As such, health disparity indices remain essential
for tracking the nation’s progress toward the overarching goal of achieving
health equity. Like the slope and concentration indices, as well as competing
index classes, the rank-dependent RI introduced in this paper accounts for
the socioeconomic gradient in health outcomes. However, unlike competing
index classes, the rank-dependent RI seems more stable relative to shifts in
the underlying distributions. It also allows the analyst to be explicit about
value judgments regarding the degree of societal aversion to health inequality
and the differential weighting of groups relative to their socioeconomic rank.


APPENDIX A: DERIVATION FROM WEIGHTED LEAST SQUARES 


Using the notation from Section 3 for $\nu \geq 1$, the weighted least-squares
regression of the power-transformed outcomes $f_\alpha(\bar{y}_{\cdot j})$ onto the standardized
socioeconomic rankings $W_\nu(R_j)$, with weights $\tilde{p}_j$, has slope and intercept
parameters, respectively,

$$
b(\nu,\alpha) = \frac{S^*(\nu,\alpha) - S^*(1,\alpha)}{W_2(\nu)/W_1(\nu)^2 - 1}
\quad\text{and}\quad
a(\nu,\alpha) = S^*(1,\alpha) - b(\nu,\alpha),
$$

where $S^*(\nu,\alpha)$ is the (weighted) product moment between the $f_\alpha(\bar{y}_{\cdot j})$ and
$W_\nu(R_j)$; $S^*(1,\alpha)$ is the (weighted) mean of the $f_\alpha(\bar{y}_{\cdot j})$, and the term
$W_2(\nu)/W_1(\nu)^2 - 1$ is the (weighted) variance of the $W_\nu(R_j)$. (The weighted mean of the latter
is 1.) From (3.5), it follows that


r /g Ma(UQ) + {W2{y)/Wi{vf)h{v,a)] \ 
\ a(i/,0) + (W2(z^)/IUi(i/)2)b(i/,0) /• 
The quantities $b(\nu,0)$ and $b(2,0)$ are akin to the extended and classical slope
indices of inequality, respectively [Wagstaff (2002), Wagstaff, Paci and van
Doorslaer (1991)].
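The slope and intercept identities of this appendix can be verified numerically. The sketch below is illustrative and makes two assumptions that are not stated in this appendix: the rank weights are $\nu(1 - R_j)^{\nu - 1}$ before standardization, and the power transform is $f_\alpha(y) = y^{1-\alpha}$. The identities themselves hold for any transform and any weights with weighted mean 1, so these choices only serve to produce concrete numbers. The input is Population 1 of Table 1, and the function name `wls_fit` is hypothetical.

```python
def wls_fit(x, y, w):
    """Weighted least-squares slope and intercept of y on x (weights sum to 1)."""
    mean_x = sum(wi * xi for wi, xi in zip(w, x))
    mean_y = sum(wi * yi for wi, yi in zip(w, y))
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Population 1 of Table 1, outcomes expressed as proportions.
p = [0.05, 0.15, 0.60, 0.20]
y = [0.30, 0.20, 0.15, 0.05]
nu, alpha = 3, 2.0

# Midpoint ranks in the cumulative population distribution.
ranks, below = [], 0.0
for pj in p:
    ranks.append(below + pj / 2.0)
    below += pj

raw = [nu * (1.0 - r) ** (nu - 1) for r in ranks]   # assumed weight function
W1 = sum(pj * wj for pj, wj in zip(p, raw))
W2 = sum(pj * wj ** 2 for pj, wj in zip(p, raw))
w_std = [wj / W1 for wj in raw]                      # standardized: weighted mean is 1
f = [yj ** (1.0 - alpha) for yj in y]                # assumed power transform

S_nu = sum(pj * wj * fj for pj, wj, fj in zip(p, w_std, f))  # weighted product moment
S_1 = sum(pj * fj for pj, fj in zip(p, f))                   # weighted mean of f

b, a = wls_fit(w_std, f, p)
print(b, (S_nu - S_1) / (W2 / W1 ** 2 - 1))
print(a, S_1 - b)
print(a + (W2 / W1 ** 2) * b, S_nu)
```

The last line shows why the combination $a(\nu,\alpha) + (W_2(\nu)/W_1(\nu)^2)\,b(\nu,\alpha)$ appears in the expression that follows (3.5): it equals the product moment $S^*(\nu,\alpha)$.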

Remark. An even more “convenient regression” results in the direct
interpretation of the $S^*(\nu,\alpha)$ as the slope of the line for the regression of the
following linear transform onto the $W_\nu(R_j)$:

$$
\bigl(W_2(\nu)/W_1(\nu)^2 - 1\bigr)\, f_\alpha(\bar{y}_{\cdot j}) + S^*(1,\alpha)\, W_\nu(R_j).
$$


APPENDIX B: SAMPLING VARIABILITY 
B.1. NHANES data. Total statistics are defined as follows for any scalar
$a$ [Talih (2013b)]:

$$
U_{aj} = \sum_{s=1}^{S} \sum_{c=1}^{C_s} \sum_{i=1}^{I_{cs}} \delta_{icsj}\, \omega_{ics}\, y_{ics}^{a},
\tag{B.1}
$$

$$
U_{a\cdot} = \sum_{j=1}^{M} U_{aj}.
\tag{B.2}
$$

In (B.1) and (B.2), $S$ is the number of strata; $C_s$ is the number of PSUs in
stratum $s$; $I_{cs}$ is the number of sample observations in the PSU-stratum pair
$(c, s)$; $\omega_{ics}$ is the sampling weight for observation $i$ in the PSU-stratum pair
$(c, s)$; $y_{ics}$ is the indicator of the adverse health outcome for observation $i$ in
the PSU-stratum pair $(c, s)$; $\delta_{icsj} = 1$ when observation $i$ [in PSU-stratum
pair $(c, s)$] belongs to group $j$ and $\delta_{icsj} = 0$ otherwise; and $j$ ranges from 1
to $M$, where $M$ is the number of groups in the population. Using the above
notation, we have $n_j/n = U_{0j}/U_{0\cdot}$ and $\bar{y}_{\cdot j} = U_{1j}/U_{0j}$. Further, define

$$
\bar{U}_{0j} = \frac{U_{0j}}{2} + \sum_{l=j+1}^{M} U_{0l}.
$$

Then $(1 - R_j) = \bar{U}_{0j}/U_{0\cdot}$. Using these total statistics, the rank-dependent RI
in (3.9) is re-expressed. For $\alpha \neq 1$,


Rl(r) = In 
For $\alpha = 1$,


= In 
Introduce an artificial variable $\sigma_{icsk}$ that represents the variance
contribution from each sample observation. The $\sigma_{icsk}$ are obtained by taking the
dot product of the vector of partial derivatives of the rank-dependent RI
with the vector of summands in the total statistics $U_{0k}$ and $U_{1k}$:

$$
\sigma_{icsk} = \delta_{icsk}\, \omega_{ics} \left[ \frac{\partial RI^{(\nu)}_\alpha}{\partial U_{0k}} + y_{ics}\, \frac{\partial RI^{(\nu)}_\alpha}{\partial U_{1k}} \right].
\tag{B.3}
$$

An estimate of the sample variance of $RI^{(\nu)}_\alpha$ is given by the sampling
variance of the total statistic $\sum_{s,c,i,k} \sigma_{icsk}$. The latter is available using
design-based estimation of variances of totals (“svytotal”) in the R package
“survey” [Lumley (2004, 2011), R Development Core Team (2011)].
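To make the last step concrete, the following sketch computes a design-based variance of a total using the usual with-replacement (ultimate cluster) approximation for stratified multistage designs: within each stratum, the PSU totals are compared with their stratum mean. It is offered as an illustration of the kind of calculation involved, not as a reproduction of the R routine cited above. The record layout and the function name are hypothetical; the values `z` stand for weighted linearized contributions such as the $\sigma_{icsk}$ summed over groups $k$.

```python
from collections import defaultdict

def variance_of_total(records):
    """Stratified, with-replacement PSU variance estimate of the total of z.

    records: iterable of (stratum, psu, z), where z already includes the sampling weight.
    """
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, z in records:
        psu_totals[stratum][psu] += z

    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} has fewer than two PSUs")
        mean = sum(totals.values()) / n_h
        # Between-PSU variability within the stratum, scaled by n_h / (n_h - 1).
        variance += n_h / (n_h - 1) * sum((t - mean) ** 2 for t in totals.values())
    return variance

# Hypothetical linearized values for two strata with two PSUs each.
example = [("A", 1, 1.0), ("A", 1, 2.0), ("A", 2, 5.0), ("B", 1, 4.0), ("B", 2, 4.0)]
print(variance_of_total(example))  # 4.0
```

In the toy data, stratum A has PSU totals 3 and 5, contributing $2 \times 2 = 4$, while stratum B has identical PSU totals and contributes nothing.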

Expressions for partial derivatives with respect to $U_{0k}$ and $U_{1k}$.

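$$
\frac{\partial \bar{U}_{0j}}{\partial U_{0k}} =
\begin{cases}
0, & \text{if } j > k, \\
1/2, & \text{if } j = k, \\
1, & \text{if } j < k,
\end{cases}
\qquad\text{and}\qquad
\frac{\partial \bar{U}_{0j}}{\partial U_{1k}} = 0,
$$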


d 


dUok 

d 

d 


(I) = 


(I) = 


((z. - l)/2)t/i,R-^ + (^ - 1) E;=o 


y-2 




Vr 


v-l 

Ok 




u—1 ' 

oj 


(II) = I a(t/ifc/t/ofc)'-“Ro"-' + 


dUok 


Oj 


0-1 


UokiUik/UokY-'^Vr 


u-2 

Ok 


k-1 

+ {o-l)Y,UoYUij/Uo_ 

j=0 




d 

dUik 


( 11 ) 


d 

dUok 


(III) 



{l-a){Uik/Uok)-^VYk-" 

ypV + ((^ - YmuokVoV^ + (^ -1) E.td uqjVqT^ 
Ef=iUo,vY-^ 
























d 


dUik 

d 

dUok 


(III) = 0, 
Notes. NHANES has a stratified multistage probability sampling design
structure [Johnson et al. (2013)]. While the sample weights provided in the
NHANES public-use data files reflect the unequal probabilities of selection,
they also reflect nonresponse adjustments and adjustments to independent
population controls. Therefore, strictly speaking, they are not the true
sampling weights $\omega_{ics}$ in (B.1).


B.2. SEER data. Following SEER*Stat [Surveillance Research Program
(2013)], crude rates are assumed to be distributed according to a Poisson
distribution. In addition, age-adjusted rates are adjusted using the year 2000
U.S. standard population, with known age-adjustment weights $\omega_k$ and sizes
$n_{kj}$. Thus, sample means and variances for the age-adjusted rates are as
follows:

$$
\bar{y}_{\cdot j} = \sum_{k=1}^{K} \omega_k\, \bar{u}_{\cdot kj}
\quad\text{and}\quad
\mathrm{Var}[\bar{y}_{\cdot j}] = \sum_{k=1}^{K} \omega_k^2\, \bar{u}_{\cdot kj} / n_{kj},
$$

where the $\bar{u}_{\cdot kj}$ are the underlying crude rates for age group $k$. Using the
expression in (3.9), we have



dy.j ^ 


The Taylor series linearization approximation to the variance of the rank- 
dependent RI yields 